Synchronous Linear Context-Free Rewriting Systems for Machine Translation
نویسنده
چکیده
We propose synchronous linear context-free rewriting systems as an extension to synchronous context-free grammars in which synchronized non-terminals span k ≥ 1 continuous blocks on each side of the bitext. Such discontinuous constituents are required for inducing certain alignment configurations that occur relatively frequently in manually annotated parallel corpora and that cannot be generated with less expressive grammar formalisms. As part of our investigations concerning the minimal k that is required for inducing manual alignments, we present a hierarchical aligner in form of a deduction system. We find that by restricting k to 2 on both sides, 100% of the data can be covered.
منابع مشابه
Hierarchical Machine Translation With Discontinuous Phrases
We present a hierarchical statistical machine translation system which supports discontinuous constituents. It is based on synchronous linear context-free rewriting systems (SLCFRS), an extension to synchronous context-free grammars in which synchronized non-terminals span k ≥ 1 continuous blocks on either side of the bitext. This extension beyond contextfreeness is motivated by certain complex...
متن کاملOptimal Reduction of Rule Length in Linear Context-Free Rewriting Systems
Linear Context-free Rewriting Systems (LCFRS) is an expressive grammar formalism with applications in syntax-based machine translation. The parsing complexity of an LCFRS is exponential in both the rank of a production, defined as the number of nonterminals on its right-hand side, and a measure for the discontinuity of a phrase, called fan-out. In this paper, we present an algorithm that transf...
متن کاملPushdown Machines for Weighted Context-Free Tree Translation
Synchronous context-free grammars (or: syntax-directed translation schemata) were introduced in the context of compiler construction in the late 1960s [12]. They define string transductions by the simultaneous derivation of an input and an output word. In contrast, modern systems for machine translation of natural language employ weighted tree transformations to account for the grammatical stru...
متن کاملMachine Translation Based on Constraint-Based Synchronous Grammar
This paper proposes a variation of synchronous grammar based on the formalism of context-free grammar by generalizing the first component of productions that models the source text, named Constraint-based Synchronous Grammar (CSG). Unlike other synchronous grammars, CSG allows multiple target productions to be associated to a single source production rule, which can be used to guide a parser to...
متن کاملSynchronous Context-Free Tree Grammars
We consider pairs of context-free tree grammars combined through synchronous rewriting. The resulting formalism is at least as powerful as synchronous tree adjoining grammars and linear, nondeleting macro tree transducers, while the parsing complexity remains polynomial. Its power is subsumed by context-free hypergraph grammars. The new formalism has an alternative characterization in terms of ...
متن کامل